An Average-Reward Reinforcement Learning Algorithm for Computing Bias-Optimal Policies

Author

  • Sridhar Mahadevan
Abstract

Average-reward reinforcement learning (ARL) is an undiscounted optimality framework that is generally applicable to a broad range of control tasks. ARL computes gain-optimal control policies that maximize the expected payoff per step. However, gain-optimality has some intrinsic limitations as an optimality criterion: for example, it cannot distinguish between different policies that all reach an absorbing goal state but incur varying costs. A more selective criterion is bias optimality, which can filter gain-optimal policies to select those that reach absorbing goals with the minimum cost. While several ARL algorithms for computing gain-optimal policies have been proposed, none of these algorithms can guarantee bias optimality, since this requires solving at least two nested optimality equations. In this paper, we describe a novel model-based ARL algorithm for computing bias-optimal policies. We test the proposed algorithm using an admission control queuing system, and show that it is able to utilize the queue much more efficiently than a gain-optimal method by learning bias-optimal policies.
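The distinction between gain and bias can be illustrated with a toy example. The sketch below is not the algorithm described in the paper; it only evaluates two fixed deterministic policies on a hypothetical three-state task (the state names, rewards, and the helper `gain_and_bias` are assumptions made for illustration). It uses the standard definitions: the gain is the long-run average reward per step, and the bias is the total reward accumulated in excess of the gain. Both policies reach an absorbing goal with zero reward, so both have the same gain, but they differ in bias.

```python
def gain_and_bias(next_state, reward, start):
    """Gain and bias of a deterministic policy that ends in an absorbing state.

    next_state: dict mapping each state to its successor under the policy
                (an absorbing state maps to itself).
    reward:     dict mapping each state to the reward received on leaving it.
    """
    path = []
    s = start
    # Follow the policy until we reach a state that loops to itself.
    while next_state[s] != s:
        if s in path:
            raise ValueError("policy cycles without reaching an absorbing state")
        path.append(s)
        s = next_state[s]
    # In the long run only the absorbing state's reward is collected.
    gain = reward[s]
    # Bias: sum of (reward - gain) over the transient part of the trajectory.
    bias = sum(reward[t] - gain for t in path)
    return gain, bias


# Hypothetical policy A: go straight to the goal at a cost of 1.
direct_next = {"start": "goal", "goal": "goal"}
direct_reward = {"start": -1.0, "goal": 0.0}

# Hypothetical policy B: take a detour, paying 1 and then 4.
detour_next = {"start": "mid", "mid": "goal", "goal": "goal"}
detour_reward = {"start": -1.0, "mid": -4.0, "goal": 0.0}

gain_a, bias_a = gain_and_bias(direct_next, direct_reward, "start")
gain_b, bias_b = gain_and_bias(detour_next, detour_reward, "start")

print("direct:", gain_a, bias_a)
print("detour:", gain_b, bias_b)
```

Both policies are gain-optimal here because their gains are equal, so a purely gain-based criterion cannot rank them. Comparing the biases prefers the direct policy, which reaches the goal at lower cost.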




Publication date: 1999